A Fuzzy Clustering Approach to Filter Spam E-Mail
Author
Abstract
Email spamming is the practice of frequently sending unwanted email messages, usually with commercial content, in large quantities to an indiscriminate set of email accounts. Because spammers continuously improve their techniques in order to defeat spam filters, building a spam filter that can learn incrementally and adapt has become an active research field. Researchers have employed machine learning techniques, such as Naïve Bayes and Support Vector Machines, that have been widely used to solve similar problems such as document classification and pattern recognition. In this paper, we examine the use of a fuzzy clustering algorithm (Fuzzy C-Means) to build a spam filter. The proposed use of Fuzzy C-Means has been tested on data sets of different sizes drawn from the SpamAssassin corpora of real users' emails. After testing Fuzzy C-Means with the Heterogeneous Value Difference Metric on data sets with varying percentages of spam, and using a standard assessment model for the spam problem, we demonstrate the potential value of our approach.
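As a rough illustration of the clustering step named in the abstract, the following is a minimal Fuzzy C-Means sketch in plain Python. It is not the authors' implementation: the function names (`fuzzy_c_means`, `euclidean`), the default parameters, and the use of Euclidean distance on purely numeric feature vectors are all assumptions made for this example. The paper uses the Heterogeneous Value Difference Metric, which is not reproduced here; the `dist` argument only shows where another metric could be plugged in.

```python
import math
import random


def euclidean(a, b):
    """Plain Euclidean distance (an assumption; the paper uses HVDM)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuzzy_c_means(points, c=2, m=2.0, iters=200, tol=1e-6,
                  dist=euclidean, seed=0):
    """Hypothetical helper: returns (centers, memberships).

    points : list of equal-length numeric feature vectors
    c      : number of clusters
    m      : fuzzifier, must be > 1
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Center update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append([
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)
            ])

        # Membership update: u_ij is proportional to d_ij ** (-2 / (m - 1)).
        shift = 0.0
        for i in range(n):
            d = [dist(points[i], centers[j]) for j in range(c)]
            zeros = [j for j, v in enumerate(d) if v == 0.0]
            if zeros:
                # Point coincides with one or more centers.
                new = [1.0 / len(zeros) if j in zeros else 0.0
                       for j in range(c)]
            else:
                inv = [v ** (-2.0 / (m - 1.0)) for v in d]
                total = sum(inv)
                new = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new

        # Stop once no membership changed by more than tol.
        if shift < tol:
            break

    return centers, u
```

In a filtering setting, each message would first be mapped to a feature vector, and a message could then be labeled by its highest membership or by thresholding its membership in the cluster associated with spam. How the paper makes that final decision is not stated in the abstract, so that step is left out of the sketch.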
Similar resources
A Classification Method for E-mail Spam Using a Hybrid Approach for Feature Selection Optimization
Spam is unwanted email that is harmful to communications around the world. It is a growing problem for personal email, so detecting it is essential. Machine learning is well suited to this problem because its adaptive nature lets it learn the patterns required for classification. Nonetheless, in spam detection, there are a large num...
A New Hybrid Approach of K-Nearest Neighbors Algorithm with Particle Swarm Optimization for E-Mail Spam Detection
Email is one of the fastest and most economical forms of communication. The growing number of email users has led to an increase in spam in recent years. Spam not only harms users' profits and wastes time and bandwidth, but has also become a risk to the efficiency, reliability, and security of networks. Spam developers are always trying to find ways to escape the existing filters; therefore, new filters to de...
A Trainable Fuzzy Spam Detection System
Electronic mail (e-mail) is considered one of the most convenient ways for Internet users to communicate. The rapid growth in the number of Internet users and the abuse of e-mail by unsolicited senders have caused an exponential increase in the e-mail arriving in user mailboxes. Although several systems use different AI techniques to filter out spam, there is hardly any system develop...
Fuzzy Clustering based on Semantic Body and its Application in Chinese Spam Filtering
The text is the main body of an e-mail. Its content is reflected by a semantic body formed from a large number of semantic elements, so studying the semantic body information of spam is the most authoritative and effective way to analyze its text. This paper first takes advantage of HowNet for semantic element analysis to analyze the semantic bodies in e-mail text, then proposes the method...
A Genetic Based Approach to Optimize The Fuzzy Clustering Spam Filters
Email spamming is the practice of frequently sending unwanted email messages, usually with commercial content, in large quantities to an indiscriminate set of email accounts. Efforts have been made to solve the spam problem from many directions. We examine the use of an optimization technique to find the best values of the fuzzy clustering parameters, which are the number of clusters and the Fuzzifi...